In this paper we present the core of LoCo, a logic-based high-level representation language for 
expressing configuration problems. LoCo is intended to allow these problems to be modelled in an 
intuitive and declarative way, the dynamic aspects of configuration notwithstanding. Our logic 
enforces that configurations contain only finitely many components, and reasoning can be reduced 
to the task of model construction. 

1 Configuration Problems 

Configuration systems are one of the most successful applications of AI techniques. In industrial 
environments, they support the configuration of complex products and, compared to manual 
processes, help to reduce error rates and increase throughput [12]. The following definition by 
Mittal and Frayman [10] describes what is typically meant by a configuration problem. 

Definition 1 (Configuration Problem) Given: A fixed, predefined set of components, where a com- 
ponent is described by a set of properties, ports for connecting it to other components, constraints at each 
port that describe the components that can be connected at that port, and other structural constraints, 
some description of the desired configuration and some criteria for making optimal selections. 
Build: One or more configurations that satisfy all the requirements, where a configuration is a set of 
components and a description of the connections between the components in the set, or, detect inconsis- 
tencies in the requirements. 

In typical configuration problems, the number of components needed for a solution is unknown 
beforehand; for example, for some components this number depends on the choices made for other 
components. One can think of this as creating new components on the fly throughout the solving 
process. Existing knowledge representation (KR) tools that can express this dynamic aspect of 
configuration require explicit bounds on all generated components as well as extensive 
knowledge about the underlying solving algorithms. 

In this work we introduce a purely declarative logical formalism where the KR engineer only 
has to specify the possible numbers of connections between any two component kinds. From this 
information finite bounds on the number of components needed in a configuration are inferred 
— that is, in any model of the configuration problem the number of components used is finite. 
Formally this logic is a fragment of classical First Order Logic (FO), extended by existential counting 
quantifiers. We plan to eventually develop translations from the logic representation into a low-level 
input format for various solvers, e.g. SAT or Integer Programming. 
2 Configuration Formalisms 

Over the years several different approaches for configuration have been investigated, e.g. expert 
systems, rule-based systems, non-monotonic reasoning, case-based reasoning, description logics and 
constraint processing. A recent survey is given by Junker in [6]. 

2.1 Constraint-Based Formalisms 

Constraint satisfaction problems (CSPs) are currently the most widely used approach for the formali- 
sation of configuration problems. However, the standard CSP formulation does not feature variables 
or sub-CSPs that are conditionally activated depending upon the values assigned to other variables. 

Hence, in the area of constraint-based configuration, a number of extensions of the traditional 
CSP paradigm have been developed. In Conditional CSPs [9] activation constraints ensure that only 
a relevant subset of the variables and constraints is used for generating a solution. In Composite 
CSPs [11] variables can have subproblems (sub-CSPs) as values. In both formalisms the number 
of possibly activated variables and constraints has to be defined in advance. Accordingly, both 
formalisms admit translations into classic CSPs [14]. 

A Generative CSP (GCSP) [13] allows the dynamic generation of components on demand during 
the search process. The reasoning starts from certain key components and then required auxiliary 
components and associated connections are incrementally added. No explicit bounds on the number 
of components have to be given and the formalism allows infinite configurations to be constructed. 

2.2 Logical Frameworks 

There have also been some previous attempts to capture configuration with logic-based formalisms. 
We recall these in some detail, as they are the starting points for our configuration logic. 

Classical CSPs correspond to the fragment $\exists\mathrm{FO}_{\wedge,+}$ of FO consisting of 
formulae built using only existential quantification and conjunction [7]: 

Definition 2 The logical counterpart of a CSP is defined as a pair $(\phi, \mathcal{D})$, where 
$\mathcal{D}$ is the constraint database, i.e., the extensional representation of all the constraint 
relations, and $\phi$ is a $\exists\mathrm{FO}_{\wedge,+}$ sentence. Solving the CSP corresponds to 
deciding whether $\mathcal{D} \models \phi$. 
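
To make this correspondence concrete, the following Python sketch decides $\mathcal{D} \models \phi$ by brute force for a sentence built from existential quantification and conjunction only. The encoding of the constraint database as a dictionary of tuple sets, the atom representation and the name `holds` are illustrative assumptions and not part of the formalism.

```python
from itertools import product

def holds(database, variables, atoms):
    """Decide whether D |= (exists variables) conjunction of atoms.

    database:  dict mapping a relation name to a set of tuples
               (the extensional constraint relations)
    variables: list of variable names, all existentially quantified
    atoms:     list of (relation name, tuple of variable names)
    """
    # Active domain: every value occurring in some constraint tuple.
    domain = sorted({v for rel in database.values() for tup in rel for v in tup})
    for values in product(domain, repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(tuple(assignment[v] for v in args) in database[rel]
               for rel, args in atoms):
            return True   # a satisfying assignment is a solution of the CSP
    return False

# Hypothetical toy database: two colours, connected nodes must differ.
D = {"Neq": {("red", "green"), ("green", "red")}}
triangle = [("Neq", ("x", "y")), ("Neq", ("y", "z")), ("Neq", ("x", "z"))]
path = [("Neq", ("x", "y")), ("Neq", ("y", "z"))]

print(holds(D, ["x", "y", "z"], triangle))  # a triangle cannot be 2-coloured
print(holds(D, ["x", "y", "z"], path))
```

The enumeration is exponential in the number of variables; it is meant only to illustrate the semantics, not as a solving technique.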

In the work by Gottlob et al. [5] logical implication has been added to this formalism to express 
the conditional inclusion of components into configurations. This 
$\exists\mathrm{FO}_{\rightarrow,\wedge,+}$ fragment of FO is one of the starting points for our own 
formalism. For example, it allows us to ask whether 
$\mathcal{D} \models (\exists x)\,\mathit{Car}(x) \wedge (\mathit{LuxuryCar}(x) \Rightarrow \mathit{HasSunRoof}(x))$. 
A drawback of $\exists\mathrm{FO}_{\rightarrow,\wedge,+}$ is that explicit bounds on the number 
of components needed have to be given (variables have a fixed finite domain) and that all 
constraints must be coded in extension in the constraint database. 

There are also two prominent formalisms based on Description Logics (DLs): The works by 
McGuinness et al. [JH]] and Klein et al. [HJ . These are the other two starting points of our formalism. 

In both works valid configurations are described using DL axioms. DLs are fragments of FO 
based on unary and binary predicates, so-called concepts and roles. In both approaches concepts 
are used for describing components and attributes; roles are used to describe the relations between 
components and also between components and attributes. Klein et al. reduce the task of finding a 
valid configuration to the problem of constructing a finite model of the axioms. McGuinness et al. 
propose an interactive approach where (1) the knowledge engineer adds atomic propositions to the 
axioms and (2) the inference engine computes the consequences until (3) eventually a finite model 
is obtained. The DLs from both formalisms always admit both finite and infinite models; hence no 
explicit bound on the number of components has to be given. The absence of predicates of arity 
greater than two can make domain encodings unnecessarily complex. 

Finally in [4] a logic-based formulation of GCSPs has been given; this formulation does not 
require that bounds on the component numbers be given, but admits infinite configuration models. 

3 The LoCo Formalism 

We now introduce the core of LoCo, a new logic-based framework for modelling practical configu- 
ration problems. In this work we do not yet address ports or optimal configurations. The basic idea 
is to describe a configuration problem (the problem domain) by a set of logical sentences. The task 
of finding a configuration is then reduced to the problem of finding a model for the logical sentences 
— this is the same approach as the one taken by Klein et al. [U . From Gottlob et al. we take the idea 
to express the conditional existence of components in configurations via implication and existential 
quantifiers. However, we use counting quantifiers for this, and these are already present in the work 
by McGuinness et al. (albeit used for a different purpose). The main idea of LoCo is that via these 
counting quantifiers we can enforce that each model of the configuration problem contains finitely 
many components only. 

3.1 Formal Basics 

Formally, LoCo is based on a fragment of classical logic with equality interpreted as identity. This 
fragment is then extended with existential counting quantifiers. 

Components: Components are modelled as n-ary predicates $\mathit{Component}(id, \vec{x})$, with $id$ 
the component's identifier, and $\vec{x}$ a vector of component attributes. Components are of 
various kinds; we will denote individual kinds by $C_1, C_2, \ldots$ 

Typed Variables: It is convenient to say that the different arguments of components have different 
types. We will introduce one type Id for each identifier of a component kind and also for each 
attribute type. We assume that there are only finitely many different types in the configuration 
domain that are all mutually disjoint. In our notation we will use typed variables in formulas. 

We now show how these typed variables can be accommodated in classical first-order logic; 
this is very similar to the reduction of many-sorted logic to classical FO (cf. e.g. flU). We first 
introduce unary predicates for each type (e.g. $ID$ for type $Id$) and add domain partitioning axioms: 

$$(\forall x) \bigvee_{T \in \mathit{Types}} T(x),$$

$$(\forall x) \bigwedge_{T_i, T_j \in \mathit{Types},\; i \neq j} \neg\big(T_i(x) \wedge T_j(x)\big).$$

Then for transforming a typed formula to an untyped one we replace e.g. each subformula 
$(\forall id)\,\phi(id)$ by $(\forall x)\, ID(x) \Rightarrow \phi(x)$ and likewise 
$(\exists id)\,\phi(id)$ by $(\exists x)\, ID(x) \wedge \phi(x)$; this is the standard 
reduction from many-sorted to classical FO. However, for the moment we are not going to introduce 
types for the terms (other than variables) of the language. Later we are going to stipulate that there 
are standard names for the elements in the domain of each type, cf. Section 3.2. 
Counting Quantifiers: For restricting the number of potential connections between components we 
use existential counting quantifiers $\exists_l^u$ with lower and upper bounds $l$ and $u$ such that 
$l \le u$, $l \ge 0$ and $u > 0$. For example, we might have a formula 
$\exists_l^u \vec{x}\, \phi(\vec{x})$ enforcing that the number of different $\vec{x}$ (here 
$\vec{x}$ denotes a sequence of variables) such that $\phi(\vec{x})$ holds is restricted to be within 
the range $[l, u]$. In classical logic without counting quantifiers this can be expressed as 

$$\bigvee_{l \le n \le u} \Big[ (\exists \vec{x}_1, \vec{x}_2, \ldots, \vec{x}_n) \big[ \phi(\vec{x}_1) \wedge \phi(\vec{x}_2) \wedge \ldots \wedge \phi(\vec{x}_n) \big] \wedge \Big[ \bigwedge_{i \neq j} \vec{x}_i \neq \vec{x}_j \Big] \wedge \Big[ (\forall \vec{x})\, \phi(\vec{x}) \Rightarrow \bigvee_i \vec{x} = \vec{x}_i \Big] \Big].$$

As usual, quantifiers range over a single type only. But occasionally, by an abuse of notation, we 
will write e.g. $\exists_l^u x\, \phi(x) \vee \psi(x)$, where $\phi$ and $\psi$ expect different types. 
This abbreviates a formula enforcing that the total number of objects such that $\phi$ or $\psi$ holds 
is between $l$ and $u$, where the disjunction is inclusive. We denote exclusive disjunction between 
types in these subformulas by $\dot{\exists}_l^u x\, \phi(x) \vee \psi(x)$; this abbreviates a formula 
enforcing that the number of objects such that $\phi$ holds is between $l$ and $u$ and there is no $x$ 
such that $\psi$ holds (or the other way around). 
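
On a finite domain the meaning of both quantifier variants can be stated operationally. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, formulas are represented as Python predicates, and an omitted upper bound is modelled by `None`.

```python
def counting_exists(domain, phi, lower, upper=None):
    """True iff the number of elements of `domain` satisfying `phi`
    lies within [lower, upper]; upper=None means no upper bound."""
    count = sum(1 for x in domain if phi(x))
    return lower <= count and (upper is None or count <= upper)

def counting_exists_exclusive(domain, phi, psi, lower, upper=None):
    """Exclusive variant: the bounds are met by the witnesses of one
    formula while the other formula has no witness at all."""
    n_phi = sum(1 for x in domain if phi(x))
    n_psi = sum(1 for x in domain if psi(x))

    def in_range(n):
        return lower <= n and (upper is None or n <= upper)

    return (in_range(n_phi) and n_psi == 0) or (in_range(n_psi) and n_phi == 0)

# Three of the ten elements satisfy the predicate.
small = lambda x: x < 3
print(counting_exists(range(10), small, 1, 3))
print(counting_exists(range(10), small, 4, 5))
```

The inclusive variant over two formulas is obtained by passing their disjunction as `phi`.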

Connections: Configuration is about connecting components: For every set $\{C_1, C_2\}$ of 
potentially connected components we introduce one of the binary predicate symbols $C_12C_2$ and 
$C_22C_1$; it does not matter which. We allow connections from a component type to itself, i.e., 
$C2C$. A predicate $C_i2C_j$ is of type $Id_i \times Id_j$. For every connection predicate $C_12C_2$ 
two formulas are included:¹ 

$$(\forall id_1, \vec{x})\; C_1(id_1, \vec{x}) \Rightarrow (\exists_{l_1}^{u_1}\, id_2)\; C_12C_2(id_1, id_2) \wedge C_2(id_2, \vec{y}) \wedge \phi(id_1, id_2, \vec{x}, \vec{y}) \tag{1}$$

$$(\forall id_2, \vec{x})\; C_2(id_2, \vec{x}) \Rightarrow (\exists_{l_2}^{u_2}\, id_1)\; C_12C_2(id_1, id_2) \wedge C_1(id_1, \vec{y}) \wedge \psi(id_1, id_2, \vec{x}, \vec{y}) \tag{2}$$

The first formula says how many components of kind $C_2$ can be connected to any given component 
of kind $C_1$, with the subformula $\phi$ (with variables among $id_1, id_2, \vec{x}, \vec{y}$) 
expressing additional constraints, e.g. an aggregate function $\sum n \le \mathit{Capacity}$. The 
second formula is for the other direction. If the connection is from a component kind to itself, 
only one of the formulas is included. 

The formulas $\phi$ and $\psi$ for expressing constraints on the connections consist of conjunctions 
and disjunctions of linear arithmetic expressions and (in-)equalities between terms. We believe this 
to be sufficient for many practical examples; if necessary we will broaden the language, but we have 
to keep in mind the planned translation to executable formats. 

Next to the rules for binary connections, there are also rules for supporting one-to-many 
connections (3), i.e. connecting one component with a set of components. For every one-to-many 
connection the component on the left-hand side needs to have binary connections to all components 
in the set on the right-hand side. This is mandatory for the propagation of bounds and will be 
discussed later on. Note also that the single component is not allowed to be part of the set. 

$$(\forall id, \vec{x})\; C(id, \vec{x}) \Rightarrow (\exists_l^u\, id_i) \Big[ \bigvee_i C2C_i(id, id_i) \wedge C_i(id_i, \vec{y}) \Big] \tag{3}$$



In this rule the quantifier $\exists_l^u$ ranges over the $i > 1$ different $Id$ types. It may also be 
replaced by the $\dot{\exists}_l^u$ quantifier enforcing that each $C$ is connected to components of 
only one of the $C_i$ kinds. The cardinality upper bound is optional; in combination with the binary 
connections, a sufficient bound can be computed automatically. 

¹ Throughout this paper free variables in formulas are to be read as existentially quantified from the outside. 
² For an explicit model of ports in LoCo we can introduce attribute types for the ports and a binary predicate 
ConnectionPorts that is then used in $\phi$. We do not do so here in order to simplify the presentation. 

3.2 Specifying Configuration Problems 

The specification CP of a configuration problem in our logic consists of two parts: 

• domain knowledge in the form of the connection axioms, naming schemes, a component 
catalogue and an axiomatisation of arithmetic; and 

• instance knowledge in the form of component domain axioms. 

Below we will speak of input and generated components. The intuition is that for the former we 
know exactly how many are used in a configuration and for the latter we don't. We stipulate that a 
configuration problem always includes at least one component of the input variant. 

3.2.1 Domain Knowledge 

Connection Axioms Connection axioms take the form introduced above. Only in binary connection 
rules do we allow the lower bound of the $\exists_l^u$ quantifier to be zero, i.e. we can have 
$l = 0$. Without further conditions this would allow us to include infinitely many components in 
configurations: Assume we have two component kinds $C_1$ and $C_2$, where each $C_1$ is connected to 
exactly one $C_2$, and each $C_2$ is connected to at most one $C_1$. It does not help if we know 
exactly how many $C_1$ there are (say $n$): we can still have infinitely many $C_2$ that are not 
connected to any of the $C_1$. 

We address this problem as follows: First, the component kinds have to be divided into the 
classes input, generated and both. Then we stipulate that for every rule for binary connections from 
$C_1$ to $C_2$ with a lower bound of zero, $C_1$ is input, or there is some other binary or 
one-to-many connection from $C_1$ with lower bound greater than zero. Then we define a level mapping 
on the component kinds via the connection axioms: Input components are on level zero. On level one 
are those generated components for which there is a (binary or one-to-many) connection axiom 
with lower bound greater than zero from the component to only input components. Level two 
components are grounded in input or level one components, and so on. 

Now any domain knowledge axiomatisation has to fulfil the following property: No matter how 
the subdivision of component kinds into the classes input, generated and both is instantiated there 
has to exist a level mapping of the components such that all components of the generated variety are 
assigned to some finite level. The existence of such a level mapping can be checked by first assigning 
generated to the components belonging to the class both and then doing a graph traversal starting 
from the input components. 
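
The check described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses a fixpoint iteration in place of an explicit graph traversal, the function name and the `grounding` encoding are hypothetical, and a one-to-many axiom is taken to ground a kind only when all kinds in its target set already have a level. The level found for a kind is some finite level, not necessarily the smallest one.

```python
def level_mapping(input_kinds, generated_kinds, grounding):
    """Assign levels to component kinds.

    grounding maps a generated kind to a list of non-empty target sets, one
    per connection axiom from that kind with lower bound greater than zero
    (a singleton set for a binary axiom, several kinds for one-to-many).
    Kinds of class 'both' are assumed to be passed in as generated.
    """
    level = {kind: 0 for kind in input_kinds}   # input kinds are on level zero
    changed = True
    while changed:
        changed = False
        for kind in generated_kinds:
            if kind in level:
                continue
            for targets in grounding.get(kind, []):
                if all(t in level for t in targets):
                    level[kind] = 1 + max(level[t] for t in targets)
                    changed = True
                    break
    return level

# Hypothetical domain: Bin is grounded in both Thing kinds, Orphan in nothing.
levels = level_mapping(
    ["ThingA", "ThingB"],
    ["Bin", "Orphan"],
    {"Bin": [{"ThingA", "ThingB"}]},
)
print(levels)
```

A domain axiomatisation passes the check when every generated kind appears in the returned dictionary; here `Orphan` does not, so the axiomatisation would be rejected.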

Attribute Naming For all attribute types a naming-scheme is included. For ordinary component 
attributes these take the form (4) where $T$ is the unary type-predicate for the given type and $V$ 
is a finite set of ground terms, the possible attribute values: 




(4) 
For component attributes of type $Id$ the naming-scheme has the form (5) where $\phi(x)$ is a FO 
formulation of the (infinitely many) possible names of elements in that type. For example, this could 
be a simple numbering axiom of the form $(\forall x)\, S(x) \Rightarrow (\exists n)\, x = \mathit{SName}(n)$. 

(Vx)JD(x) (5) 

By default, unique name axioms for all distinct terms are also included. Hence naming-scheme 
axioms of the form Q force the domain of the type to be equal to the set of all terms t such that 
4>(t), whereas the form © only forces the domain of the type to be a subset thereof. 

To sum up, for each component kind the Id attribute is unbounded, but ordinary attributes can 
have only finitely many distinct values. However, in each model of a configuration problem only 
finitely many components will exist. We introduce a new variable type Excess without naming- 
scheme axiom: The names of components not used in a configuration can be discarded by assigning 
them to this type. Finally, for every component kind we introduce an axiom 

$$(\forall id_i, id_j, \vec{x}, \vec{y})\, \big[\, C(id_i, \vec{x}) \wedge C(id_j, \vec{y}) \wedge id_i = id_j \,\big] \Rightarrow \vec{x} = \vec{y}$$

expressing the fact that, in database terminology, the respective $Id$ is a key. 

Component Catalogue For each component kind the so-called catalogue contains information on 
the instances that actually can be manufactured. We express this as axioms (where each $\vec{V}_i$ 
is a tuple of ground attribute values): 

$$(\forall id, \vec{x})\; C(id, \vec{x}) \Rightarrow \bigvee_i \vec{x} = \vec{V}_i$$

3.2.2 Instance Knowledge 

On the instance level the components assigned to the class both have to be divided into input and 
generated components. For components $C$ of the input variant we make a closure assumption on 
the domain of the component identifiers: 

$$(\forall x)\; ID(x) \Leftrightarrow \bigvee_{ID_i \in N} x = ID_i,$$

where $N$ is a finite set of identifiers $ID_i$ and $ID$ is the respective type predicate. This axiom 
is stronger than the naming-scheme for the component; hence, in any model, identifiers mentioned in 
the naming-scheme axiom but not in the domain closure axiom will belong to the type Excess. 

Both input and generated components that have to be used in the configuration can be explicitly 
listed as atoms (with possibly uninstantiated, existentially quantified arguments). We also allow 
positive and negative ground connection predicates like, for example, $\neg C_12C_2(ID_1, ID_2)$. 

3.3 Finite Model Property 

Next we are going to show that in any model of a configuration domain specification for all compo- 
nents the domain of the Id attribute is finite. 

Proposition 1 (Configurations contain finitely many components) Let $CP$ be a configuration 
domain specification and $\mathcal{I}$ be an interpretation such that $\mathcal{I} \models CP$. Then 
for all components the domain of the Id attribute is finite in $\mathcal{I}$. 

Proof Sketch: Assume that for a component $C$ the domain of the Id attribute is infinite and that $C$ 
is connected to some other component(s) $C_i$ via a binary or one-to-many connection such that the 
domain of all the $C_i$ is finite in $\mathcal{I}$. Then $\mathcal{I}$ is not a model for $CP$. The 
existence of a level mapping guarantees that each component is grounded in components with finite 
domains. □ 

Calculating upper and lower bounds: In order to be able to transform a problem model into 
e.g. SAT or OPL, we need to know the lower and upper bounds on the number of instances for 
each component of the "generated" variety. For computing these possible domain sizes of generated 
components, we extract Diophantine inequalities from the connection formulas. This builds on 
the work by Falkner et al. about semantics of UML class diagrams and cardinalities applied to the 
configuration domain [3]. 

Assume a binary connection defined by formulas (1) and (2), where $C_1$ is an input and $C_2$ is a 
generated component. We can calculate upper and lower bounds for component $C_2$ as follows: 

$$l_2 \cdot |C_2| \le n \le u_2 \cdot |C_2| \tag{6}$$

$$\begin{aligned} l_1 \cdot |C_1| &\le u_2 \cdot |C_2| \\ l_2 \cdot |C_2| &\le u_1 \cdot |C_1| \end{aligned} \tag{7}$$



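The number of possible links $n$ between the components is bounded as shown in (6); by formula (1), 
the analogous bounds $l_1 \cdot |C_1| \le n \le u_1 \cdot |C_1|$ hold from the side of $C_1$. From 
this we can derive the inequalities (7) representing the relation between $C_1$ and $C_2$. After some 
simple combinatorics we get the lower bound $LB = \lceil l_1 \cdot |C_1| / u_2 \rceil$ and the upper 
bound $UB = \lfloor u_1 \cdot |C_1| / l_2 \rfloor$, resulting in formula (8) for the bounds of $C_2$. 
It can be seen from the formula that to define a lower resp. an upper bound for $C_2$, we need the 
cardinality bounds $u_2$ resp. $l_2$ in the direction of $C_1$; in particular, no upper bound can be 
derived when $l_2 = 0$. The described computation also applies to connections between two generated 
components, provided that component $C_1$ has properly defined bounds. In this scenario we insert 
the lower bound on $C_1$ for computing $LB$ and the upper bound on $C_1$ for computing $UB$ of $C_2$. 

$$\left\lceil \frac{l_1 \cdot \lfloor |C_1| \rfloor}{u_2} \right\rceil \;\le\; |C_2| \;\le\; \left\lfloor \frac{u_1 \cdot \lceil |C_1| \rceil}{l_2} \right\rfloor \tag{8}$$

Here $\lfloor |C_1| \rfloor$ and $\lceil |C_1| \rceil$ denote the lower and the upper bound on $|C_1|$. 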
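
The computation in formula (8) can be sketched in Python as follows. The function name and the convention of returning `None` when no upper bound can be derived (lower cardinality bound zero in the direction of $C_1$) are assumptions made for illustration.

```python
def binary_bounds(l1, u1, l2, u2, c1_lb, c1_ub):
    """Bounds on |C2| for a binary connection.

    l1, u1: each C1 is connected to between l1 and u1 components of kind C2
    l2, u2: each C2 is connected to between l2 and u2 components of kind C1
    c1_lb, c1_ub: lower and upper bound on |C1| (equal for input components)
    """
    # Integer ceiling division: ceil(a / b) == -(-a // b) for b > 0.
    lower = -(-(l1 * c1_lb) // u2)
    # Without a positive l2 nothing forces a C2 to be linked to any C1.
    upper = (u1 * c1_ub) // l2 if l2 > 0 else None
    return lower, upper

# 20 input components, each linked to exactly one C2;
# each C2 is linked to between 1 and 4 of them.
print(binary_bounds(1, 1, 1, 4, 20, 20))
```

Integer arithmetic is used throughout so that the rounding matches the ceiling and floor in the formula without floating-point error.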



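$$\left\lceil \frac{\sum_i l_i \cdot \lfloor |C_i| \rfloor}{u} \right\rceil \;\le\; |C| \;\le\; \left\lfloor \frac{\sum_i u_i \cdot \lceil |C_i| \rceil}{l} \right\rfloor \tag{9}$$

Here $l$ and $u$ are the bounds of the one-to-many connection (3), and $l_i$, $u_i$ are the bounds of 
the binary connection from $C_i$ in the direction of $C$. 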



In the case of one-to-many connections as shown in formula (3), new bounds are calculated for 
the component on the left-hand side. For this computation we combine a one-to-many connection 
with all existing binary connections between the current component and the components on the 
many-side. In other words, we take the cardinalities from a one-to-many constraint in direction to 
the set and the cardinalities of the binary connections in direction to the current component and 
compute bounds analogously to a simple binary connection (see formula (9)). 
The above procedures refine the bounds on the domain of components in single connections. 
However, if the domain size of one component is updated, then the domain size of other components 
may have to be updated again. We also have to take into account the one-to-many connections. The 
algorithm introduced below addresses both tasks until eventually for the domain sizes a fixpoint is 
obtained — or a contradiction has been detected. 

First, the algorithm puts all input components on a stack in any order (line 2). The algorithm 
then takes a component off the stack and iteratively determines all binary connections including 
the current component (line 5). We perform a bound computation for each binary connection with 
a generated component and check if new bounds were computed for this generated component 
(line 8). If this is the case, we update the bounds, taking the maximum of all computed lower 
bounds and the minimum of all computed upper bounds. After an update we check the bounds 
for consistency, i.e. we check that lower bound ≤ upper bound (line 10), and put the connected 
component on the stack for further propagation of bounds (line 11). The algorithm terminates with 
an error whenever bounds become inconsistent. 

If the current component is of type generated, then we also check if there exist one-to-many 
connections to a set of other components and iterate over them (line 14). In the case of 
one-to-many connections new bounds are calculated for the current component and not for the 
connected components. If we obtain new bounds for the current component, we perform an update and a 
consistency check similar to what is done for the binary connections and put the component back 
on the stack again to propagate the new bounds via the binary connections. 

The algorithm then iteratively pops the next component off the stack and does the same com- 
putation step until the stack is empty. The algorithm is guaranteed to terminate and ensures the 
proper computation of maximal lower and minimal upper bounds on all generated components. 

BOUND-PROPAGATION 

 1  create an empty stack 
 2  put all inpComp ∈ input-components on the stack in any order 
 3  while stack is not empty 
 4      do currComp ← POP(stack) 
 5         for all (currComp, nbComp) ∈ binary-connections 
 6             do if nbComp ∈ generated-components 
 7                   then COMPUTE-BOUNDS(currComp, nbComp) 
 8                        if NEW-BOUNDS(nbComp) 
 9                           then UPDATE-BOUNDS(nbComp) 
10                                if LB(nbComp) ≤ UB(nbComp) 
11                                   then PUSH(nbComp, stack) 
12                                   else REJECT 
13         if currComp ∈ generated-components 
14            then for all (currComp, nbComps) ∈ 1-to-many-connections 
15                     do COMPUTE-BOUNDS(currComp, nbComps) 
16                        if NEW-BOUNDS(currComp) 
17                           then UPDATE-BOUNDS(currComp) 
18                                if LB(currComp) ≤ UB(currComp) 
19                                   then PUSH(currComp, stack) 
20                                   else REJECT 
21  ACCEPT 
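
The following Python sketch is one possible reading of BOUND-PROPAGATION, not the authors' implementation. The data encoding, all names, the `max_steps` guard and the treatment of one-to-many connections (summing the binary links of all member kinds, and deriving no lower bound when the one-to-many upper bound is omitted) are assumptions made for illustration. The instance is the modified bin-packing problem of Section 4.

```python
import math

INF = math.inf

def ceil_div(a, b):
    """Ceiling of a / b for integers with b > 0."""
    return -(-a // b)

def propagate_bounds(input_sizes, generated, binary, one_to_many, max_steps=10_000):
    """input_sizes: kind -> exact number of instances (input components)
    generated:   set of generated kinds
    binary:      (a, b) -> (l_ab, u_ab, l_ba, u_ba), cardinalities a->b and b->a
    one_to_many: list of (kind, l, u_or_None, member_kinds)
    Returns kind -> (lower, upper); raises ValueError on inconsistent bounds."""
    bounds = {k: [n, n] for k, n in input_sizes.items()}
    bounds.update({k: [0, INF] for k in generated})
    links = {k: {} for k in bounds}          # every connection, seen from both ends
    for (a, b), (l_ab, u_ab, l_ba, u_ba) in binary.items():
        links[a][b] = (l_ab, u_ab, l_ba, u_ba)
        links[b][a] = (l_ba, u_ba, l_ab, u_ab)

    def update(kind, new_lb, new_ub):
        lb, ub = bounds[kind]
        tight = [max(lb, new_lb), min(ub, new_ub)]
        if tight == [lb, ub]:
            return False                     # no new bounds
        if tight[0] > tight[1]:
            raise ValueError(f"inconsistent bounds for {kind}: {tight}")  # REJECT
        bounds[kind] = tight
        return True

    stack = list(input_sizes)                # lines 1-2
    steps = 0
    while stack:                             # line 3
        steps += 1
        if steps > max_steps:                # guard added for this sketch only
            raise RuntimeError("no fixpoint reached within max_steps")
        cur = stack.pop()                    # line 4
        cur_lb, cur_ub = bounds[cur]
        for nb, (l_cur, u_cur, l_nb, u_nb) in links[cur].items():    # line 5
            if nb not in generated:          # line 6
                continue
            new_lb = ceil_div(l_cur * cur_lb, u_nb)                  # formula (8)
            new_ub = (u_cur * cur_ub) // l_nb if l_nb > 0 and cur_ub != INF else INF
            if update(nb, new_lb, new_ub):   # lines 8-12
                stack.append(nb)
        if cur in generated:                 # line 13
            for owner, l, u, members in one_to_many:                 # line 14
                if owner != cur:
                    continue
                lo = sum(links[m][cur][0] * bounds[m][0] for m in members)
                hi = sum(links[m][cur][1] * bounds[m][1] for m in members)
                new_lb = ceil_div(lo, u) if u is not None else 0
                new_ub = hi // l if l > 0 and hi != INF else INF
                if update(cur, new_lb, new_ub):                      # lines 16-20
                    stack.append(cur)
    return {k: tuple(v) for k, v in bounds.items()}                  # ACCEPT

result = propagate_bounds(
    {"ThingA": 20, "ThingB": 20},
    {"Bin"},
    {("ThingA", "Bin"): (1, 1, 0, 5), ("ThingB", "Bin"): (1, 1, 0, 2)},
    [("Bin", 1, None, ["ThingA", "ThingB"])],
)
print(result["Bin"])  # (10, 40)
```

Bounds only ever tighten, so on domains that satisfy the level-mapping requirement the loop reaches a fixpoint; the step limit merely protects the sketch against inputs that violate it.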
4 Example: Modified Bin-Packing 

We illustrate our approach by means of a simple Bin-Packing example, where we distinguish 
between two component kinds of Things, A and B, with all Things having a certain size. The Bins have 
an upper bound on how many Things of each kind can be put into them. Things are input 
components while the Bins are generated components, the aim being to minimise their number. The 
problem can be described by the following formulas: 



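$$(\forall id_{TA}, size)\; C_{TA}(id_{TA}, size) \Rightarrow (\exists_1^1\, id_{Bin})\; C_{TA}2C_{Bin}(id_{TA}, id_{Bin}) \wedge C_{Bin}(id_{Bin}) \tag{10}$$

$$(\forall id_{Bin})\; C_{Bin}(id_{Bin}) \Rightarrow (\exists_0^5\, id_{TA})\; C_{TA}2C_{Bin}(id_{TA}, id_{Bin}) \wedge C_{TA}(id_{TA}, size) \wedge \sum size \le 5 \tag{11}$$

$$(\forall id_{TB}, size)\; C_{TB}(id_{TB}, size) \Rightarrow (\exists_1^1\, id_{Bin})\; C_{TB}2C_{Bin}(id_{TB}, id_{Bin}) \wedge C_{Bin}(id_{Bin}) \tag{12}$$

$$(\forall id_{Bin})\; C_{Bin}(id_{Bin}) \Rightarrow (\exists_0^2\, id_{TB})\; C_{TB}2C_{Bin}(id_{TB}, id_{Bin}) \wedge C_{TB}(id_{TB}, size) \wedge \sum size \le 2 \tag{13}$$

$$(\forall id_{Bin})\; C_{Bin}(id_{Bin}) \Rightarrow (\exists_1\, id_T)\; \big(C_{TA}2C_{Bin}(id_T, id_{Bin}) \wedge C_{TA}(id_T, y)\big) \vee \big(C_{TB}2C_{Bin}(id_T, id_{Bin}) \wedge C_{TB}(id_T, y)\big) \tag{14}$$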



Formula (10) states that every ThingA has to be put into exactly one Bin. The backwards 
direction in formula (11) determines that a Bin has a total size bound of 5 for ThingA. Up to 5 of 
those Things can be put into a Bin in case all those Things have minimum size 1 (hence the 
cardinality upper bound is 5). Formulas (12) and (13) analogously define the binary connection for 
ThingB. 

Assume an instance with 20 Things of each kind. Using the bound computations defined in (8), 
connection ThingA-Bin gives a lower bound of 4 and connection ThingB-Bin gives a lower bound of 10 
for component Bin. We take the maximum of all available values, hence the lower bound for Bin 
is 10. Notice that in (11) and (13) the cardinality lower bounds of the connections are defined as 
zero to express the situation that a Bin could contain only one kind of Thing without the other. As 
a result, we cannot compute an upper bound for Bin using the binary connections defined so far, 
and this would violate the finite model requirement. In order to express that for a Bin to exist it 
needs to have at least one Thing in it, we define a one-to-many connection between Bin and the set 
of Things (14). It is sufficient to define only a lower bound for this connection; in conjunction 
with the binary connections we can now compute an upper bound of 40 for Bin, which would occur in a 
situation where every Thing is put in a separate Bin. 
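
As a cross-check, the arithmetic behind these numbers can be written out. The lower bounds instantiate formula (8) with $|C_{TA}| = |C_{TB}| = 20$. The upper bound combines the lower bound 1 of the one-to-many connection (14) with the upper bounds 1 of (10) and (12), under the assumption that the one-to-many computation sums the possible links of both Thing kinds.

$$\begin{aligned}
LB_{TA} &= \left\lceil \frac{1 \cdot 20}{5} \right\rceil = 4, \qquad
LB_{TB} = \left\lceil \frac{1 \cdot 20}{2} \right\rceil = 10, \\
LB &= \max(4, 10) = 10, \\
UB &= \left\lfloor \frac{1 \cdot 20 + 1 \cdot 20}{1} \right\rfloor = 40 .
\end{aligned}$$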
5 Conclusion and Future Work 

We presented the core of LoCo, a high-level language for modelling configuration problems, includ- 
ing the conditional generation of components. The key feature of the formalism is that the number 
of components used in configurations is bounded implicitly by the possible number of connections 
between components. As a next step we plan to extend LoCo so that it is possible to: 

• express that the presence of one connection in a configuration depends on the presence of 
some other connection; 

• specify arbitrary combinations of components in the rules for one-to-many connections; and 

• incorporate a component taxonomy, where components can be subkinds of other components. 

Once this is completed we plan to translate LoCo to an executable format such as SAT, OPL or 
answer set solving. We also intend to carefully analyse the complexity of e.g. model construction or 
the bounds propagation algorithm in LoCo. 

Acknowledgement We thank Markus Stumptner and Heribert Vollmer for many helpful discussions. 